AITopics | Mercer County

Mercer County

Instead of Taking Your Job, A.I. Might Transform It

The New Yorker, Jun-5-2026, 20:24:36 GMT

Proponents and critics of artificial intelligence often compare the technology to industrial automation--really, it's more like an intern. One summer during high school, I took a temporary job writing computer programs for a consulting firm. Each morning, I drove through rush-hour traffic to an office park near Princeton, New Jersey, on the crowded Route 1 corridor. At a desk in some sort of equipment room, I coded quick-and-dirty database tools for internal use. One of my programs simplified the process of logging hours into timesheets.

culture fiction & poetry humor, large language model, natural language, (10 more...)

Country: North America > United States > New Jersey > Mercer County > Princeton (0.24)

Industry:

Information Technology (1.00)
Banking & Finance (0.69)
Transportation > Ground > Road (0.54)
Education > Educational Setting (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.32)

Momentum Further Constrains Sharpness at the Edge of Stochastic Stability

Andreyev, Arseniy, Ananthkumar, Advikar, Walden, Marc, Poggio, Tomaso, Beneventano, Pierfrancesco

arXiv.org Machine Learning, Apr-16-2026

Recent work suggests that (stochastic) gradient descent self-organizes near an instability boundary, shaping both optimization and the solutions found. Momentum and mini-batch gradients are widely used in practical deep learning optimization, but it remains unclear whether they operate in a comparable regime of instability. We demonstrate that SGD with momentum exhibits an Edge of Stochastic Stability (EoSS)-like regime with batch-size-dependent behavior that cannot be explained by a single momentum-adjusted stability threshold. Batch Sharpness (the expected directional mini-batch curvature) stabilizes in two distinct regimes: at small batch sizes it converges to a lower plateau $2(1-β)/η$, reflecting amplification of stochastic fluctuations by momentum and favoring flatter regions than vanilla SGD; at large batch sizes it converges to a higher plateau $2(1+β)/η$, where momentum recovers its classical stabilizing effect and favors sharper regions consistent with full-batch dynamics. We further show that this aligns with linear stability thresholds and discuss the implications for hyperparameter tuning and coupling.
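
The large-batch plateau quoted above coincides with the classical linear stability limit of heavy-ball momentum on a quadratic. The following is a minimal Python sketch, assuming the heavy-ball update x_next = x - η·λ·x + β·(x - x_prev) on a one-dimensional quadratic with curvature λ. The abstract does not state which momentum variant is used, so this form and the function names are illustrative assumptions. It evaluates both plateaus and checks the classical limit numerically.

```python
import numpy as np

def plateaus(eta, beta):
    """Return the (small-batch, large-batch) plateaus quoted in the abstract."""
    small_batch = 2.0 * (1.0 - beta) / eta
    large_batch = 2.0 * (1.0 + beta) / eta
    return small_batch, large_batch

def heavy_ball_spectral_radius(curvature, eta, beta):
    """Spectral radius of heavy-ball iterations on f(x) = curvature * x**2 / 2.

    Assumed update (not taken from the paper):
        x_next = x - eta * curvature * x + beta * (x - x_prev)
    The state (x, x_prev) evolves linearly, so the iteration is stable
    exactly when the spectral radius of the 2x2 matrix is below 1.
    """
    M = np.array([[1.0 + beta - eta * curvature, -beta],
                  [1.0, 0.0]])
    return float(np.max(np.abs(np.linalg.eigvals(M))))

eta, beta = 0.01, 0.9  # illustrative hyperparameters
low, high = plateaus(eta, beta)
rho_below = heavy_ball_spectral_radius(0.99 * high, eta, beta)
rho_above = heavy_ball_spectral_radius(1.01 * high, eta, beta)
print(f"small-batch plateau: {low:.1f}, large-batch plateau: {high:.1f}")
print(f"spectral radius just below / above the large-batch plateau: "
      f"{rho_below:.3f} / {rho_above:.3f}")
```

Under these assumptions the spectral radius stays below 1 for curvature just under 2(1+β)/η and exceeds 1 just above it. The small-batch plateau is a stochastic effect reported by the paper and is not reproduced by this deterministic check.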

artificial intelligence, machine learning, regime, (17 more...)

arXiv.org Machine Learning

2604.14108

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers

Buyuktahtakin, I. Esra

arXiv.org Machine Learning, Apr-14-2026

Artificial intelligence (AI) is moving increasingly beyond prediction to support decisions in complex, uncertain, and dynamic environments. This shift creates a natural intersection with operations research and management sciences (OR/MS), which have long offered conceptual and methodological foundations for sequential decision-making under uncertainty. At the same time, recent advances in deep learning, including feedforward neural networks, LSTMs, transformers, and deep reinforcement learning, have expanded the scope of data-driven modeling and opened new possibilities for large-scale decision systems. This tutorial presents an OR/MS-centered perspective on deep learning for sequential decision-making under uncertainty. Its central premise is that deep learning is valuable not as a replacement for optimization, but as a complement to it. Deep learning brings adaptability and scalable approximation, whereas OR/MS provides the structural rigor needed to represent constraints, recourse, and uncertainty. The tutorial reviews key decision-making foundations, connects them to the major neural architectures in modern AI, and discusses leading approaches to integrating learning and optimization. It also highlights emerging impact in domains such as supply chains, healthcare and epidemic response, agriculture, energy, and autonomous operations. More broadly, it frames these developments as part of a wider transition from predictive AI toward decision-capable AI and highlights the role of OR/MS in shaping the next generation of integrated learning--optimization systems.

machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2604.11507

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(7 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Completely random measures for modelling block-structured sparse networks

Tue Herlau, Mikkel N. Schmidt, Morten Mørup

Neural Information Processing Systems, Mar-23-2026, 06:52:33 GMT

Neural Information Processing Systems http://nips.cc/

caron & fox, random measure, representation, (15 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Andrej Risteski, Yuanzhi Li

Neural Information Processing Systems, Mar-23-2026, 00:21:27 GMT

Neural Information Processing Systems http://nips.cc/

entropy, relaxation, variational method, (15 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.44)

Kriging via variably scaled kernels

Audone, Gianluca, Marchetti, Francesco, Perracchione, Emma, Rossini, Milvia

arXiv.org Machine Learning, Mar-19-2026

Classical Gaussian processes and Kriging models are commonly based on stationary kernels, whereby correlations between observations depend exclusively on the relative distance between scattered data. While this assumption ensures analytical tractability, it limits the ability of Gaussian processes to represent heterogeneous correlation structures. In this work, we investigate variably scaled kernels as an effective tool for constructing non-stationary Gaussian processes by explicitly modifying the correlation structure of the data. Through a scaling function, variably scaled kernels alter the correlations between data and enable the modeling of targets exhibiting abrupt changes or discontinuities. We analyse the resulting predictive uncertainty via the variably scaled kernel power function and clarify the relationship between constructions based on variably scaled kernels and classical non-stationary kernels. Numerical experiments demonstrate that Gaussian processes based on variably scaled kernels yield improved reconstruction accuracy and provide uncertainty estimates that reflect the underlying structure of the data.
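
For illustration, a variably scaled kernel is commonly built by appending the value of a scaling function ψ to each input and evaluating a standard kernel on the augmented points. The following is a minimal Python sketch, assuming a Gaussian base kernel and a hypothetical step-shaped ψ with a jump at x = 0.5. Neither choice is taken from the paper, and all names are illustrative.

```python
import numpy as np

def gaussian_kernel(A, B, length_scale=0.3):
    """Stationary Gaussian kernel between the rows of A (n, d) and B (m, d)."""
    sq_dist = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))

def psi(x):
    """Hypothetical scaling function: a unit jump at x = 0.5."""
    return np.where(x < 0.5, 0.0, 1.0)

def variably_scaled_kernel(x, y, length_scale=0.3):
    """Evaluate the base kernel on inputs augmented with psi (assumed construction)."""
    X = np.column_stack([x, psi(x)])
    Y = np.column_stack([y, psi(y)])
    return gaussian_kernel(X, Y, length_scale)

# Three nearby points; the last one lies on the other side of the jump.
x = np.array([0.45, 0.49, 0.51])
K_plain = gaussian_kernel(x[:, None], x[:, None])
K_vsk = variably_scaled_kernel(x, x)
print(np.round(K_plain, 4))
print(np.round(K_vsk, 4))
```

In this sketch, points on the same side of the jump keep their stationary correlation, while the correlation across the jump drops sharply even though the points are close. This is the kind of behaviour one would want when the target has a discontinuity there.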

artificial intelligence, machine learning, modeling & simulation, (18 more...)

arXiv.org Machine Learning

2603.1695

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Oregon (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

A New Kernel Regularity Condition for Distributed Mirror Descent: Broader Coverage and Simpler Analysis

Qiu, Junwen, Zeng, Ziyang, Mei, Leilei, Zhang, Junyu

arXiv.org Machine Learning, Mar-16-2026

Existing convergence analyses of distributed optimization methods in non-Euclidean geometries typically rely on two kernel assumptions: (i) global Lipschitz smoothness and (ii) bi-convexity of the associated Bregman divergence function. Unfortunately, these conditions are violated by nearly all kernels used in practice, leaving a huge theory-practice gap. This work closes this gap by developing a unified analytical tool that guarantees convergence under mild conditions. Specifically, we introduce Hessian relative uniform continuity (HRUC), a regularity condition satisfied by nearly all standard kernels. Importantly, HRUC is closed under concatenation, positive scaling, composition, and various kernel combinations. Leveraging the geometric structure induced by HRUC, we derive convergence guarantees for mirror descent-based gradient tracking without imposing any restrictive assumptions. More broadly, our analysis techniques extend seamlessly to other decentralized optimization methods in genuinely non-Euclidean and non-Lipschitz settings.

artificial intelligence, kernel, optimization problem, (16 more...)

arXiv.org Machine Learning

2603.12838

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model

Aaron Sidford, Mengdi Wang, Xian Wu, Lin Yang, Yinyu Ye

Neural Information Processing Systems, Mar-13-2026, 14:20:54 GMT

Computing an approximately optimal policy with high probability in this case is known as PAC RL with a generative model.

algorithm, artificial intelligence, machine learning, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Learning Shortest Paths with Generative Flow Networks

Morozov, Nikita, Maksimov, Ian, Tiapkin, Daniil, Samsonov, Sergey

arXiv.org Machine Learning, Mar-3-2026

In this paper, we present a novel learning framework for finding shortest paths in graphs utilizing Generative Flow Networks (GFlowNets). First, we examine theoretical properties of GFlowNets in non-acyclic environments in relation to shortest paths. We prove that, if the total flow is minimized, forward and backward policies traverse the environment graph exclusively along shortest paths between the initial and terminal states. Building on this result, we show that the pathfinding problem in an arbitrary graph can be solved by training a non-acyclic GFlowNet with flow regularization. We experimentally demonstrate the performance of our method in pathfinding in permutation environments and in solving Rubik's Cubes. For the latter problem, our approach achieves results competitive with state-of-the-art machine learning approaches designed specifically for this task in terms of solution length, while requiring a smaller search budget at test time.

machine learning, natural language, shortest path, (18 more...)

arXiv.org Machine Learning

2603.01786

Country: